Abstract — This paper reviews a data mining system that we 
have developed. The system covers two areas, “Product” and 
“Share”. Under the category “Product”, we analyze three 
sub-categories: ‘Product Purchased’, ‘Customer Points’, and 
‘Customer Bills’. Under the category “Share”, we analyze the 
sub-categories ‘Share in Demand’ and ‘Share Price’. 

Index Terms — Data Mining, Data Mining System 

I. Introduction 

Data mining refers to extracting or ‘mining’ interesting 
knowledge from large amounts of data [1]. It provides a 
means of extracting previously unknown, predictive information 
from the base of accessible data in data warehouses. 
Data mining tools use sophisticated, automated algorithms to 
discover hidden patterns, correlations, and relationships 
among organizational data. These tools are used to predict 
future trends and behaviors, allowing businesses to make 
proactive, knowledge-driven decisions [2]. 

Data Mining (sometimes called data or knowledge 
discovery) is the process of analyzing data from different 
perspectives and summarizing it into useful information - 
information that can be used to increase revenue, cut costs, or 
both. Data mining software is one of a number of analytical 
tools for analyzing data. It allows users to analyze data from 
many different dimensions or angles, categorize it, and 
summarize the relationships identified. Technically, data 
mining is the process of finding correlations or patterns 
among dozens of fields in large relational databases. 

II. Process 

The process of data mining consists of three stages: 

1. The initial exploration. 

2. Model building or pattern identification with 
validation/verification. 

3. Deployment (i.e., the application of the model to new 
data in order to generate predictions). 
Figure 1, Diagram of Data Mining Process 


Stage 1: Exploration. 

This stage usually starts with data preparation, which may 
involve cleaning data, transforming data, selecting subsets 
of records and, for data sets with large numbers of 
variables ("fields"), performing some preliminary feature 
selection to bring the number of variables down to a 
manageable range (depending on the statistical methods 
being considered). Then, depending on the nature of the 
analytic problem, this first stage may involve anything from 
a simple choice of straightforward predictors for a 
regression model to elaborate exploratory analyses using a 
wide variety of graphical and statistical methods (see 
Exploratory Data Analysis (EDA)). The aim is to identify the 
most relevant variables and to determine the complexity 
and/or the general nature of the models that can be taken 
into account in the next stage. 
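
The preparation steps named above (cleaning, transformation, and a preliminary feature selection) can be illustrated with a minimal sketch. The records, field names, and the "keep fields that take more than one value" rule are assumptions made for this illustration and are not taken from the developed system.

```python
# Hypothetical raw records; field names and values are illustrative only.
raw_records = [
    {"customer": "C1", "bill": "120.50", "points": "12", "region": "north"},
    {"customer": "C2", "bill": "", "points": "7", "region": "north"},
    {"customer": "C3", "bill": "80.00", "points": "8", "region": "north"},
    {"customer": "C4", "bill": "200.25", "points": "20", "region": "north"},
]

NUMERIC_FIELDS = ("bill", "points")


def clean(records):
    """Drop records with a missing numeric value and convert text to floats."""
    cleaned = []
    for rec in records:
        if any(rec[f].strip() == "" for f in NUMERIC_FIELDS):
            continue  # cleaning: skip incomplete records
        row = dict(rec)
        for f in NUMERIC_FIELDS:
            row[f] = float(rec[f])  # transformation: text -> number
        cleaned.append(row)
    return cleaned


def informative_fields(records, fields):
    """Preliminary feature selection: keep fields that take more than one value."""
    return [f for f in fields if len({rec[f] for rec in records}) > 1]


prepared = clean(raw_records)
kept = informative_fields(prepared, ["bill", "points", "region"])
print(len(prepared), kept)
```

Here the record with an empty bill is discarded, and the constant field "region" is dropped because it cannot help distinguish one record from another. A real project would likely use a more careful selection criterion.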

Stage 2: Model building and validation. 

This stage involves considering various models and 
choosing the best one based on their predictive performance 
(i.e., explaining the variability in question and producing 
stable results across samples). This may sound like a simple 
operation, but in fact, it sometimes involves a very elaborate 
process. There are a variety of techniques developed to 
achieve that goal - many of which are based on so-called 
"competitive evaluation of models," that is, applying different 
models to the same data set and then comparing their 
performance to choose the best. These techniques - which are 
often considered the core of predictive data mining - include: 
Bagging (Voting, Averaging), Boosting, Stacking (Stacked 
Generalizations), and Meta-Learning. 
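
As a rough illustration of "competitive evaluation of models", the sketch below fits two candidate models to the same training data and keeps the one with the lower error on held-out data. The data values, the two candidate models, and the holdout split are assumptions chosen for brevity. This is a plain holdout comparison, not an implementation of bagging, boosting, or stacking.

```python
# Hypothetical (x, y) observations; values are illustrative only.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1),
        (6, 12.0), (7, 13.8), (8, 16.1), (9, 18.0), (10, 19.9)]

train, test = data[:7], data[7:]  # simple holdout split


def fit_mean(rows):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def fit_line(rows):
    """Least-squares straight line y = a + b * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def mse(model, rows):
    """Mean squared error of a fitted model on held-out rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


# Apply every candidate to the same data and compare their performance.
candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: mse(fit(train), test) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Scoring on rows that were not used for fitting is what makes the comparison a check of predictive performance rather than of fit to the training sample.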

Stage 3: Deployment. 
This final stage involves taking the model selected as best in 
the previous stage and applying it to new data in order to 
generate predictions or estimates of the expected outcome. 
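
Deployment can be pictured as reloading a stored model and applying it to inputs it has not seen. In the sketch below, the straight-line model, its parameter values, and the JSON storage format are all assumptions made for illustration.

```python
import json

# Hypothetical parameters of a model chosen in Stage 2 (y = a + b * x),
# stored as JSON so they can be reloaded wherever predictions are needed.
saved_model = json.dumps({"intercept": 0.5, "slope": 2.0})


def load_model(text):
    """Rebuild a prediction function from stored parameters."""
    params = json.loads(text)
    return lambda x: params["intercept"] + params["slope"] * x


predict = load_model(saved_model)
new_inputs = [11, 12, 13]  # new data not seen during model building
predictions = [predict(x) for x in new_inputs]
print(predictions)
```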

The concept of Data Mining is becoming increasingly 
popular as a business information management tool where it is 
expected to reveal knowledge structures that can guide 
decisions in conditions of limited certainty. Recently, there 
has been increased interest in developing new analytic 
techniques specifically designed to address the issues relevant 
to business Data Mining (e.g., Classification Trees), but Data 
Mining is still based on the conceptual principles of statistics, 
including traditional Exploratory Data Analysis (EDA) and 
modeling, and it shares with them both components of its 
general approach and specific techniques. 
Figure 2, Gateway of Data Mining System 



III. Prior Work 

Prior work in the area of data mining includes the 
following: 

ADaM: ADaM is a data mining toolkit designed for use 
with scientific and image data. It includes pattern recognition, 
image processing, optimization, and association rule mining 
capabilities. ADaM does not contain grid projection, 
advanced subsetting, advanced statistical analysis, format 
conversion, visualization or other types of tools that may be 
useful in the analysis of scientific data sets. The system 
consists of a set of individual components that can be used 
together to perform complex tasks. 

AlphaMiner: AlphaMiner was developed by the E-Business 
Technology Institute (ETI) of the University of Hong Kong 
with support from the Innovation and Technology Fund 
(ITF) of the Government of the Hong Kong Special 
Administrative Region (HKSAR). It is an open source data 
mining platform that is intended to provide a favorable 
cost-and-performance ratio for data mining applications. 

KNIME: KNIME is an open-source, enterprise-grade analytics 
platform for data-driven work. It is intended to help 
organizations discover the potential hidden in their data, 
mine it for new insights, and make predictions. 

IV. Proposed Work 

In this research work, we aimed to develop an open-source 
data mining application for two areas, Product and Share. 
The developed application runs in any web browser. With it, 
the user can predict which products are required and which 
are not, which product is most in demand among the 
consumers who come to purchase, which customer holds the 
most points and which the fewest, which share is in demand, 
and which share has a high cost. 
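
The questions listed above can be expressed as simple aggregate lookups. The sketch below uses made-up in-memory data; the product names, customer names, share symbols, and the use of order counts as a proxy for "demand" are assumptions for illustration and do not describe how the developed system is implemented.

```python
from collections import Counter

# Hypothetical sample data; names and numbers are illustrative only.
purchases = ["soap", "rice", "soap", "tea", "rice", "soap"]
customer_points = {"Asha": 120, "Ravi": 45, "Meena": 300}
share_prices = {"ABC": 210.0, "XYZ": 95.5, "PQR": 480.0}
share_orders = ["XYZ", "ABC", "XYZ", "PQR", "XYZ"]

# Product most in demand: the product purchased most often.
product_demand = Counter(purchases)
most_demanded_product = product_demand.most_common(1)[0][0]

# Customers with the maximum and minimum points.
top_customer = max(customer_points, key=customer_points.get)
bottom_customer = min(customer_points, key=customer_points.get)

# Share in demand (most orders) and share with the highest price.
share_in_demand = Counter(share_orders).most_common(1)[0][0]
costliest_share = max(share_prices, key=share_prices.get)

print(most_demanded_product, top_customer, bottom_customer,
      share_in_demand, costliest_share)
```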

V. Result 

The data mining system was developed using open-source 
software and is shown in the figure below. Using the 
developed system, the user can make managerial decisions, 
which is the ultimate goal of any data mining system. 


VI. Discussion 

In this research work, we explored the problem of data 
mining: how to store the data and how to retrieve records 
at the front end so that they prove beneficial for making 
managerial decisions. 
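
One possible way to picture the store-and-retrieve step is a small relational table queried for a per-product summary. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, columns, and rows are hypothetical, and the paper does not state which database the developed system uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the sketch

# Store: a hypothetical table of customer bills.
conn.execute("CREATE TABLE bills (customer TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO bills VALUES (?, ?, ?)",
    [("Asha", "soap", 40.0), ("Ravi", "rice", 120.0),
     ("Asha", "tea", 60.0), ("Meena", "rice", 110.0)],
)

# Retrieve: number of bills and total amount per product, largest total first.
rows = conn.execute(
    "SELECT product, COUNT(*), SUM(amount) FROM bills "
    "GROUP BY product ORDER BY SUM(amount) DESC"
).fetchall()
conn.close()

for product, bill_count, total in rows:
    print(product, bill_count, total)
```

A summary of this kind is the sort of record a front end could present to support a managerial decision.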

The developed data mining system determines which product 
is in demand and mines the stored data for that product 
accordingly: what the actual profit and loss are, who the 
competitors of the product are, what went wrong, where the 
company or industry is lacking, what to do to increase 
consumption, and how to launch a product. It will be very 
helpful to any industry interested in doing market analysis 
in a particular field. Processing is fast, so a user of the 
developed system can get results quickly. The developed 
system can also be used for online information extraction 
by deploying it online. 

VII. Conclusion 

Figure 2 above shows the first page of the developed data 
mining system. In the developed system, mining can be done 
in two areas, Product and Share. In both areas we obtain 
refined information on which to base decisions. 